Numerous studies have emerged claiming that AI, robotics and other forms of smart automation will displace US jobs in the coming 10 to 15 years. This digital revolution has unleashed a new wave of advanced machines, further automating complex tasks and jeopardizing skilled workers in positions once considered difficult to automate. Research increasingly shows that these disruptive technologies – predictive analytics, artificial intelligence, the Internet of Things, automation and robotics – are not only becoming better, but are also being combined to increase productivity and growth. This report highlights significant differences in the degree of automatability of jobs by industry sector.
How susceptible are jobs to automation, and what might these shifts mean for future employment?
Datasets:
Education attainment by occupation - U.S. Bureau of Labor Statistics
Wages by occupation - U.S. Bureau of Labor Statistics
# Importing packages
import pandas as pd
import numpy as np
import plotly.express as px
import plotly.graph_objects as go
import seaborn as sns
import matplotlib.pyplot as plt
from pywaffle import Waffle
from wordcloud import WordCloud,STOPWORDS
import fuzzymatcher
from docx import Document
# Importing the occupation, employment size & growth and education data
xls = pd.ExcelFile("occupation.xlsx")
oc = pd.read_excel(xls, sheet_name = "Table 1.7", header = 2, skipfooter = 4 )
oc = oc.drop(oc.columns[[1]],axis = 1)
oc.columns = ["Employment Title","Occupation type","Employment 2019","Employment 2029",
"Employment Absolute Change","Employment Percentage Change","Percentage Self-Employed'19",
"AverageOccupational openings'19-'29","Median annual wage'19","Education","Work Experience",
"On-the-job Training"]
oc = oc[oc["Occupation type"] == "Line item"]
oc = oc.rename({"Employment Title":"OCC_TITLE"}, axis = 1)
oc
# Importing the broad job categories data, growth, salary
oc2 = pd.read_excel(xls, sheet_name = "Table 1.1", header =2, skipfooter = 5 )
oc2 = oc2.rename ({"Unnamed: 0":"OCC_TITLE","Unnamed: 6":"Median annual wage, 2019"}, axis =1)
oc2 = oc2.drop(columns = ["Unnamed: 1"], axis = 1)
oc2
# Importing wages data
wage = pd.read_excel("national_M2018_dl.xlsx" )
wage = wage[(wage["OCC_GROUP"] == "detailed")]
wage = wage.drop(columns = ["OCC_CODE"], axis =1)
# Importing job automation data
auto = pd.read_excel("Degree_of_Automation.xls", header = 3)
auto = auto.rename ({"Occupation":"OCC_TITLE","Context":"AUTOMATION"}, axis =1)
def rate(x):
    # Bucket the automation score into three bands
    if x <= 30:
        return "Least Automated"
    elif 30 < x <= 45:
        return "Moderately Automated"
    else:
        return "Highly Automated"
auto["DEGREE"] = auto["AUTOMATION"].apply(rate)
auto = auto.drop(columns = ["Code"], axis =1)
auto.sort_values("OCC_TITLE")
# Importing probability of job automation data
document = Document('Probability of Automation Data Set.docx')
tables = []
for table in document.tables:
    # Copy every cell of the Word table into a list of lists, then into a DataFrame
    df = [['' for i in range(len(table.columns))] for j in range(len(table.rows))]
    for i, row in enumerate(table.rows):
        for j, cell in enumerate(row.cells):
            if cell.text:
                df[i][j] = cell.text
    tables.append(pd.DataFrame(df))
doc = pd.concat(tables)
doc.columns = doc.iloc[0]
doc = doc.iloc[1:,:]
doc = doc[doc["Routineness"]!="Routineness"]
doc = doc.drop(columns = ["No.","Stratum"], axis = 1)
doc = doc.rename({"Occupation" : "OCC_TITLE"},axis = 1)
doc
# Salary, employment and automation
df = wage[wage['OCC_TITLE'].isin(auto['OCC_TITLE'])]
df = df[["OCC_TITLE","TOT_EMP","A_MEAN"]]
df = pd.merge(df, auto, on ='OCC_TITLE')
df
# Education, percentage change in job demand, automation
df2 = fuzzymatcher.fuzzy_left_join(auto, oc, left_on = "OCC_TITLE", right_on = "OCC_TITLE")
df2 = df2[["AUTOMATION","OCC_TITLE_left","DEGREE","Employment 2019", "Employment 2029","Median annual wage'19",
"Employment Percentage Change","Education"]]
df2 = df2.rename ({"OCC_TITLE_left":"OCC_TITLE","Employment Percentage Change":"% CHANGE '19-'29",
"Education":"EDUCATION"}, axis =1)
df2["EDUCATION"] = df2["EDUCATION"].replace(['Some college, no degree'],'No formal educational credential')
df2["Median annual wage'19"] = df2["Median annual wage'19"].replace({">=$208,000":"208000","—":0})
df2 = df2.dropna()
df2
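Occupation titles are not spelled identically across the sources, which is why a fuzzy join is used here instead of an exact merge. The idea can be sketched with the standard library's `difflib` as a stand-in; this is not the algorithm `fuzzymatcher` uses, and the titles, the `best_match` helper and the 0.6 cutoff are all hypothetical choices for illustration.

```python
import difflib

# Hypothetical occupation titles standing in for the two sources
left_titles = ["Accountants and Auditors", "Civil Engineers", "Chief Executives"]
right_titles = ["Accountants and auditors", "Civil engineers", "Registered nurses"]

def best_match(title, candidates, cutoff=0.6):
    """Return the closest candidate by difflib similarity, or None below the cutoff."""
    lowered = {c.lower(): c for c in candidates}
    hits = difflib.get_close_matches(title.lower(), list(lowered), n=1, cutoff=cutoff)
    return lowered[hits[0]] if hits else None

matches = {t: best_match(t, right_titles) for t in left_titles}
for left, right in matches.items():
    print(f"{left!r} -> {right!r}")
```

A title with no sufficiently similar counterpart maps to `None`, which plays the same role as the unmatched rows that `dropna()` removes after the left join.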
# Education, percentage change in job demand, automation for broad job categories
df3 = fuzzymatcher.fuzzy_left_join(oc2, auto, left_on = "OCC_TITLE", right_on = "OCC_TITLE")
df3 = df3.iloc[1:,3:]
df3["OCC_TITLE_left"] = df3["OCC_TITLE_left"].str.rsplit(' ', n = 1).str[0]
df3
# Probability of automation
prob = doc.copy()
def degree(x):
    # Bucket the probability of automation into three risk levels
    if x < 0.1:
        return "Low"
    elif 0.1 <= x <= 0.9:
        return "Moderate"
    else:
        return "High"
prob["Degree"] = prob["P(Auto)"].astype(float).apply(degree)
prob
# Percentage of jobs automated
waf = prob[["Degree","P(Auto)"]].copy()
value = dict(waf["Degree"].value_counts())
percent = dict(waf["Degree"].value_counts(normalize = True))
labels = [f'{k} Automation: {v*100:.1f}%' for k,v in percent.items()]
# Plotting the percentage of jobs based on degree of automation
fig = plt.figure(FigureClass = Waffle, rows = 14, values = value,
colors = ("#983D3D","#232066", "#DCB732"),icons = 'child',
icon_size = 22, icon_legend = True,font_size = 20,
legend = {"loc": (1.05,0.5), "labels":labels,'fontsize': 12},
title = {"label":"Percentage of jobs automated",'loc': 'left', "fontsize": 15},
figsize = (12, 8))
plt.show()
Jobs across all sectors in the US are increasingly adopting automation and new technologies. 23% of current US jobs are at high risk of automation, 55.7% fall into the moderate automation range, and only 23% are currently least affected by automation.
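The shares come from bucketing each occupation's probability and normalizing the counts. A minimal sketch on made-up probabilities, assuming the same 0.1 and 0.9 cut-offs as the `degree` function:

```python
import pandas as pd

# Made-up probabilities, for illustration only (not the study's data)
p_auto = pd.Series([0.02, 0.05, 0.35, 0.50, 0.62, 0.80, 0.88, 0.93, 0.97, 0.99])

def degree(x):
    """Below 0.1 is Low, above 0.9 is High, everything else is Moderate."""
    if x < 0.1:
        return "Low"
    elif x <= 0.9:
        return "Moderate"
    return "High"

# normalize=True turns counts into fractions; scale to percentages
shares = p_auto.apply(degree).value_counts(normalize=True).mul(100).round(1)
print(shares)
print("Total:", shares.sum())
```

Because the three categories partition the occupations, printing the total is a cheap sanity check that the shares add up to 100%.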
# Plotting the types of jobs automated based on degree of automation
fig = px.scatter(prob,x = prob["OCC_TITLE"],y = "P(Auto)",hover_data = ['OCC_TITLE','Degree'], color = 'Degree')
fig.update_layout(title = " Degree Of Automation", yaxis_title = "Probability Of Automation",
xaxis_title = " ", template = "plotly_white", font = dict(size = 12),
legend = dict(orientation = "h",yanchor = "bottom",y = 1.02,
xanchor = "right",x = 1, title = " "))
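# Per-point template: a shared array of titles would be misaligned across the three color traces
fig.update_traces(hovertemplate = "%{x}<br>P(Auto): %{y}<extra></extra>")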
fig.update_layout(margin = {'l':0,'r':0})
fig.show()
Top executives, scientists, civil engineers, physicians, therapists, artists, social workers, astronomers and physicists are largely confined to the low-risk category, whereas the probability of automation increases as jobs shift towards manual, labor-centric roles.
# Plotting strength of relationship between job complexity & job automation
from scipy.stats import pearsonr
corr, _ = pearsonr(doc["P(Auto)"].astype(float), doc["Complexity"].astype(float))
print('Pearsons correlation: %.3f' % corr)
fig = px.scatter(doc, x = "P(Auto)", y = "Complexity", trendline = "ols",template = "ggplot2")
fig.update_traces(hovertemplate = doc['OCC_TITLE'])
fig.update_xaxes(title = "P(Automation)")
fig.update_layout(margin = {'l':0,'r':0,'t':0,'b':0})
fig.show()
The above graph depicts a clear inverse correlation between the complexity of job roles and the probability of job automation, indicating that less complex jobs are at higher risk of automation.
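Pearson's r is what the code reports as the strength of this relationship. As a rough sketch on made-up values (not the study's data), it can be computed directly from its definition and checked against NumPy; the `pearson_r` helper is a hypothetical name.

```python
import numpy as np

# Made-up values for illustration: complexity falls as P(Auto) rises
p_auto = np.array([0.05, 0.20, 0.40, 0.60, 0.80, 0.95])
complexity = np.array([9.0, 8.0, 6.5, 5.0, 3.0, 2.0])

def pearson_r(x, y):
    """Covariance of x and y divided by the product of their standard deviations."""
    xc, yc = x - x.mean(), y - y.mean()
    return float((xc * yc).sum() / np.sqrt((xc ** 2).sum() * (yc ** 2).sum()))

r = pearson_r(p_auto, complexity)
print(f"Pearson's r: {r:.3f}")
print(f"NumPy check: {np.corrcoef(p_auto, complexity)[0, 1]:.3f}")
```

A value near -1 indicates a strong inverse linear relationship, a value near +1 a strong positive one, and a value near 0 little linear relationship.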
# Plotting strength of relationship between job routineness & job automation
corr, _ = pearsonr(doc["P(Auto)"].astype(float), doc["Routineness"].astype(float))
print('Pearsons correlation: %.3f' % corr)
fig = px.scatter(doc, x = "P(Auto)", y = "Routineness", trendline = "ols", template = "ggplot2")
fig.update_traces(hovertemplate = doc['OCC_TITLE'])
fig.update_xaxes(title = "P(Automation)")
fig.update_layout(margin = {'l':0,'r':0,'t':0,'b':0})
fig.show()
In contrast, there is a strong positive correlation between how routine a job is and its probability of automation. Routine jobs governed by a clear set of steps or a fixed process are easier for computers and machines to replicate.
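The trend line in the scatter plot is an ordinary least-squares fit. A minimal sketch of the same kind of fit using `np.polyfit` on made-up values; this illustrates the technique and is not how Plotly computes its trend line internally.

```python
import numpy as np

# Made-up values for illustration: routineness rises with P(Auto)
p_auto = np.array([0.05, 0.20, 0.40, 0.60, 0.80, 0.95])
routineness = np.array([1.5, 2.5, 4.0, 5.5, 7.5, 8.5])

# A degree-1 polynomial fit is the ordinary least-squares line
slope, intercept = np.polyfit(p_auto, routineness, deg=1)
fitted = slope * p_auto + intercept
residuals = routineness - fitted

print(f"routineness = {slope:.2f} * P(Auto) + {intercept:.2f}")
```

A positive slope matches the direction of the correlation, and with an intercept in the model the residuals of a least-squares fit sum to zero.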
# Plotting job wages and degree of automation
data = df.copy()
data = data.replace({"*":0})
data.dtypes
data.sort_values(by="A_MEAN",ascending=False)
data["RANGE"] = pd.cut(data["A_MEAN"],bins = 4)
data["RANGE"] = data["RANGE"].astype(str).replace({"(133510.0, 200265.0]": "133510 - 200265",
"(66755.0, 133510.0]":"66755 - 133510",
"(200265.0, 267020.0]":"200265 - 267020",
"(-267.02, 66755.0]":"0 - 66755"})
job = data.copy()
job = job.groupby(["RANGE","DEGREE"])[["OCC_TITLE"]].count().reset_index()
colors = {"Highly Automated":"#d62728","Moderately Automated":"#ff7f0e","Least Automated":"#17becf"}
fig = px.bar(job, x = "RANGE", y = "OCC_TITLE", color = "DEGREE", barmode = "group",color_discrete_map = colors)
fig.update_layout(title = "Job Automation By Salary",xaxis_title = " ",yaxis_title = "Number of jobs",
legend = dict(orientation = "h",yanchor = "bottom",y = 1.02,xanchor = "right",
x = 1, title = ""), template = "plotly_white",
xaxis = {"categoryorder":"total descending"})
fig.update_traces(hovertemplate = "Jobs: %{y:}K")
fig.update_yaxes(ticksuffix = "K")
fig.update_xaxes(tickprefix = "$")
fig.show()
There is a clear relationship between wages and the likelihood of automation: low-wage jobs face a higher probability of automation, while higher-wage jobs are less prone to it.
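Raw counts per wage range can mislead when the ranges hold very different numbers of occupations, so it can help to look at shares within each range as well. A sketch on made-up rows, with two hypothetical wage bands split at $66,755 (the first bin edge in the ranges above):

```python
import pandas as pd

# Made-up occupations for illustration only
jobs = pd.DataFrame({
    "A_MEAN": [28000, 35000, 41000, 52000, 78000, 95000, 120000, 180000],
    "DEGREE": ["Highly Automated", "Highly Automated", "Moderately Automated",
               "Highly Automated", "Moderately Automated", "Least Automated",
               "Least Automated", "Least Automated"],
})

# Two hypothetical wage bands
jobs["BAND"] = pd.cut(jobs["A_MEAN"], bins=[0, 66755, float("inf")],
                      labels=["Up to $66,755", "Above $66,755"])

# Row-normalized shares: the fraction of each wage band in each automation degree
shares = pd.crosstab(jobs["BAND"], jobs["DEGREE"], normalize="index").mul(100).round(1)
print(shares)
```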
# Plotting boxplot for job wages and degree of automation
fig = px.box(data, x = data["RANGE"], y = "AUTOMATION",hover_data = ['OCC_TITLE', 'DEGREE'], points = "all")
fig.update_layout(title = "",xaxis_title = "",yaxis_title = "Automation",legend_orientation = "h",
template = "plotly_white",xaxis = {"categoryorder":"total descending"})
fig.update_traces(marker_color = "#17becf")
fig.update_xaxes(tickprefix = "$")
fig.show()
Lower salary ranges show wide automation ranges, indicating larger variation in job automation rates, while the ranges are narrower among higher salary groups. Although the medians are the same in the lower two salary ranges, the absolute number of automated jobs in these groups is higher, which may suggest greater job losses to automation.
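The spread that a box plot shows can be put into numbers with the median and the interquartile range per group. A minimal sketch on made-up scores; the `iqr` helper is a hypothetical name.

```python
import pandas as pd

# Made-up automation scores for illustration only
scores = pd.DataFrame({
    "RANGE": ["0 - 66755"] * 6 + ["133510 - 200265"] * 4,
    "AUTOMATION": [10, 20, 30, 40, 50, 60, 18, 20, 22, 24],
})

def iqr(s):
    """Interquartile range: the height of the box in a box plot."""
    return s.quantile(0.75) - s.quantile(0.25)

spread = scores.groupby("RANGE")["AUTOMATION"].agg(["median", iqr, "count"])
print(spread)
```

Reading `count` next to `iqr` separates two things the plot mixes together: how many occupations sit in a range and how widely their automation scores vary.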
# No. of jobs automated and education
edu = df2.copy()
edu = edu.groupby(["DEGREE","EDUCATION"])["OCC_TITLE"].count().reset_index()
# Plotting numbers of jobs automated based on education
colors = {"Highly Automated":"#d62728","Moderately Automated":"#ff7f0e","Least Automated":"#17becf"}
fig = px.bar(edu, x = edu["EDUCATION"].astype(str), y = edu["OCC_TITLE"],color = edu["DEGREE"].astype(str),
barmode = "group",color_discrete_map = colors)
fig.update_layout(title = "Job Automation By Education Levels",xaxis_title = " ",yaxis_title = "Number of jobs",
legend = dict(orientation = "h",yanchor = "bottom",y = 1.02,xanchor = "right",
x = 1, title = ""),template = "plotly_white",
xaxis = dict(tickmode = 'array',
tickvals = ["High school diploma or equivalent",
"Bachelor's degree", "No formal educational credential",
"Doctoral or professional degree","Postsecondary nondegree award",
"Associate's degree","Master's degree"],
ticktext = ["High school <br> diploma", "Bachelor's <br> degree",
"No formal <br> education","Doctoral or <br> professional degree",
"Postsecondary <br> nondegree", "Associate's <br> degree",
"Master's <br> degree"]))
fig.update_traces(hovertemplate = "Jobs: %{y:}K")
fig.update_yaxes(ticksuffix = "K")
fig.show()
# Plotting percentage of jobs automated based on education
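edu_pcts = edu.copy()
# Share of each automation degree within its education level
# (transform keeps the row order and avoids duplicated index levels from groupby.apply)
totals = edu_pcts.groupby("EDUCATION")["OCC_TITLE"].transform("sum")
edu_pcts["OCC_TITLE"] = (100 * edu_pcts["OCC_TITLE"] / totals).round(2)
edu_pcts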
colors = {'Highly Automated':'#d62728','Moderately Automated':'#ff7f0e','Least Automated':'#17becf'}
fig = px.bar(edu_pcts, x = edu_pcts["EDUCATION"].astype(str), y = edu_pcts["OCC_TITLE"],
color = edu_pcts["DEGREE"].astype(str), barmode = 'group',color_discrete_map = colors)
fig.update_layout(title = "Job Automation By Education Levels(%)",xaxis_title = " ",
yaxis_title = "Percentage of jobs",legend = dict(orientation = "h",yanchor="bottom",
y = 1.02,xanchor = "right",x = 1, title = ""),
template = "plotly_white",xaxis = dict(tickmode = 'array',
tickvals = ["High school diploma or equivalent",
"Bachelor's degree",
"No formal educational credential",
"Doctoral or professional degree",
"Postsecondary nondegree award",
"Associate's degree","Master's degree"],
ticktext = ["High school <br> diploma",
"Bachelor's <br> degree",
"No formal <br> education",
"Doctoral or <br> professional degree",
"Postsecondary <br> nondegree",
"Associate's <br> degree",
"Master's <br> degree"]))
fig.update_traces(hovertemplate = "Jobs: %{y:}%")
fig.update_yaxes(ticksuffix = "%")
fig.show()
Although the share of jobs at risk of automation is moderate to low at every education level, jobs requiring less educational attainment have a higher percentage of automated jobs, indicating that workers with higher educational attainment would be less vulnerable in the long run.
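Percentages matter here because the education groups differ greatly in size, so raw counts alone would hide the pattern. A sketch on made-up counts showing the within-group normalization; the `PCT` column name is a hypothetical choice.

```python
import pandas as pd

# Made-up counts for illustration only
counts = pd.DataFrame({
    "EDUCATION": ["High school diploma or equivalent"] * 3 + ["Master's degree"] * 3,
    "DEGREE": ["Highly Automated", "Moderately Automated", "Least Automated"] * 2,
    "OCC_TITLE": [60, 90, 50, 2, 8, 10],
})

# Divide each count by its education-level total so groups of different size are comparable
totals = counts.groupby("EDUCATION")["OCC_TITLE"].transform("sum")
counts["PCT"] = (100 * counts["OCC_TITLE"] / totals).round(1)
print(counts)
```

In this toy data the high school group has thirty times as many highly automated occupations by count, but only three times the share.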
# Plotting boxplot for jobs automated based on education
fig = px.box(df2, x = df2["EDUCATION"].astype(str), y = "AUTOMATION", points = "all",
hover_data = ['OCC_TITLE', 'DEGREE'])
fig.update_layout(title = "",xaxis_title = "",
yaxis_title = "Automation",legend_orientation = "h",
template = "plotly_white",xaxis = dict(tickmode = 'array',
tickvals = ["High school diploma or equivalent", "Bachelor's degree",
"Associate's degree","No formal educational credential",
"Postsecondary nondegree award","Doctoral or professional degree",
"Master's degree"],
ticktext = ["High school <br> diploma", "Bachelor's <br> degree",
"Associate's <br> degree","No formal <br> education",
"Postsecondary <br> nondegree","Doctoral or <br> professional degree",
"Master's <br> degree"],categoryorder = "total descending"))
fig.update_traces(marker_color = "#17becf")
fig.show()
Median automation rates fall as education levels rise. The main exceptions are the no-formal-education group, which does not follow the trend, and the associate's degree group, which has higher automation rates. Holding a graduate degree appears to protect against potential automation-related job losses.
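To read the medians in education order rather than alphabetical order, the education column can be treated as an ordered categorical. A minimal sketch on made-up scores; the ordering of the levels below is an assumption about how they rank, not something taken from the data.

```python
import pandas as pd

# Assumed ordering of the education levels, lowest to highest
levels = ["No formal educational credential", "High school diploma or equivalent",
          "Postsecondary nondegree award", "Associate's degree", "Bachelor's degree",
          "Master's degree", "Doctoral or professional degree"]

# Made-up automation scores for illustration only
sample = pd.DataFrame({
    "EDUCATION": ["Bachelor's degree", "High school diploma or equivalent",
                  "Master's degree", "High school diploma or equivalent",
                  "Bachelor's degree", "Master's degree"],
    "AUTOMATION": [30, 48, 22, 40, 26, 18],
})

# An ordered categorical makes the groupby output follow the education ladder
sample["EDUCATION"] = pd.Categorical(sample["EDUCATION"], categories=levels, ordered=True)
medians = sample.groupby("EDUCATION", observed=True)["AUTOMATION"].median()
print(medians)
```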
# Employment size
emp = fuzzymatcher.fuzzy_left_join(oc, prob, left_on = "OCC_TITLE", right_on = "OCC_TITLE")
emp = emp.iloc[:,3:]
emp["Median annual wage'19"] = emp["Median annual wage'19"].replace({">=$208,000":"208000","—":0})
emp = emp[(emp["Median annual wage'19"] != 0)]
emp = emp.rename({"OCC_TITLE_left":"Occupation"}, axis = 1)
# Plotting employment size based on median salary and education
fig = px.scatter(emp, x = "Median annual wage'19", y = "P(Auto)",color = 'Education',
size = "Employment 2019",size_max = 40,hover_data = ['Occupation',"Employment 2019"])
fig.update_yaxes(title = "P(Automation)")
fig.update_xaxes(title = "")
fig.update_layout(template = "plotly_white",legend_orientation = "h")
fig.update_layout(margin = {'l':0,'r':0,'t':0,'b':0})
fig.update_xaxes(tickprefix = "$")
fig.show()
A large population of lower-wage employees with low education levels is highly vulnerable to job displacement in the future. A few high-paying occupations, such as accountants, sales representatives and managers (general/service/operational), could also face job losses due to increased automation of repetitive tasks.
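Bubble size matters in this chart because risk per occupation and risk per worker are different questions. A sketch on made-up occupations, weighting the probability of automation by employment and using 0.9 as the cut-off for high risk (the same threshold as the `degree` function):

```python
import pandas as pd

# Made-up occupations for illustration only (employment in thousands)
occ = pd.DataFrame({
    "Occupation": ["A", "B", "C", "D"],
    "Employment 2019": [3000.0, 1500.0, 500.0, 1000.0],
    "P(Auto)": [0.95, 0.60, 0.92, 0.05],
})

total_emp = occ["Employment 2019"].sum()

# The unweighted mean treats every occupation alike; weighting by employment reflects workers
simple_mean = occ["P(Auto)"].mean()
weighted_mean = (occ["P(Auto)"] * occ["Employment 2019"]).sum() / total_emp

# Share of workers employed in occupations above the high-risk cut-off
high_share = 100 * occ.loc[occ["P(Auto)"] > 0.9, "Employment 2019"].sum() / total_emp

print(f"Unweighted mean P(Auto): {simple_mean:.2f}")
print(f"Employment-weighted mean P(Auto): {weighted_mean:.2f}")
print(f"Workers in high-risk occupations: {high_share:.1f}%")
```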
# Employment growth rate in absolute
fig = go.Figure()
fig.add_trace(go.Scatter(x = df3["2019"], y = df3["OCC_TITLE_left"],
marker = dict(color = "crimson", size = 12),
mode = "markers", name = "2019"))
fig.add_trace(go.Scatter(x = df3["2029"], y = df3["OCC_TITLE_left"],
marker = dict(color = "gold", size = 12),
mode = "markers", name = "2029"))
fig.update_layout(title = dict(text ='Employment Growth Rate(Absolute)',
x = 0.4, xanchor='center', y = 0.98,yanchor = 'top'),
xaxis_title = "Number of jobs",yaxis_title = "",template = "ggplot2",
height = 700, width = 1000,font = dict(size = 10),
legend = dict(orientation = "h",yanchor = "bottom",
y = 1.00,xanchor = "right",x = 1, title = ""))
fig.update_layout(margin = {'l':0,'r':0,'t':0,'b':0})
fig.show()
# Employment growth rate in percentages
df3["Color"] = np.where(df3["Percent"]<0, 'crimson', 'teal')
fig = go.Figure()
fig.add_trace(go.Bar(x = df3["OCC_TITLE_left"], y = df3["Percent"],marker_color = df3["Color"],
text = df3["Percent"], textposition = "outside"))
fig.update_xaxes(showticklabels = False)
fig.update_layout(title=dict(text ='Employment Growth Rate(Percentage)',
x = 0.24, xanchor = 'center', y = 0.98,yanchor ='top'),
template = "ggplot2",height = 600, width = 1000,hoverlabel_font_color = 'white')
fig.update_yaxes(title = "Percentage Growth", ticksuffix = "%")
fig.show()
Future employment growth is likely to shift towards sectors such as Healthcare, Social Service, and Computer & Mathematical occupations, which deal with non-routine tasks and require strong cognitive and interpersonal skills. In contrast, sectors such as Production, Sales support, and Farming & Agriculture, which rely on routine tasks and follow explicit, systematic procedures, will see declining growth in the coming years.
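The growth figures plotted above are simple percentage changes between the two years. A minimal sketch on made-up employment figures; the group names and the `Change` column are hypothetical, while `Percent` and `Color` mirror the columns used in the bar chart.

```python
import numpy as np
import pandas as pd

# Made-up employment figures (thousands) for illustration only
growth = pd.DataFrame({
    "Group": ["Group A", "Group B", "Group C"],
    "2019": [1000.0, 2000.0, 500.0],
    "2029": [1150.0, 1900.0, 500.0],
})

# Absolute change, then change relative to the base year
growth["Change"] = growth["2029"] - growth["2019"]
growth["Percent"] = (100 * growth["Change"] / growth["2019"]).round(1)

# Shrinking groups are drawn in crimson, the rest in teal
growth["Color"] = np.where(growth["Percent"] < 0, "crimson", "teal")
print(growth)
```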
As companies embrace and implement automation, the workforce will be elevated to more complex and creative job functions.
More businesses will focus on integration of human centric and AI centric processes.
Highly creative or technical positions that require cognitive skills, interpersonal skills and emotional intelligence are the most likely to prevail.
Low-wage earners will be among the first to see their jobs disappear, since many of their tasks are routine based.
The future workforce may need to switch occupations, acquire new skills and obtain higher educational qualifications.
Moreover, the ongoing pandemic could push more businesses to speed up job automation. These labor-saving technologies could become permanent, displacing millions of jobs in the future.
from PIL import Image
Image1 = Image.open("McKinsey.png");
Image1
Image2 = Image.open("Business Insider.png");
Image2